System and Method for Automated Risk Assessment for School Violence

ABSTRACT

A system and method for predicting risk of violence for an individual (primarily school violence, but not limited to school violence) performs the following steps: (a) receiving responses to questions from an individual; (b) extracting by a computerized annotator words or phrases from the questions and responses; (c) assigning by the annotator extracted word(s) or phrase(s) to at least one of a plurality of pre-defined categories; and (d) automatically identifying and scoring words or phrases that could be classified into the pre-defined categories by a trained machine-learning engine to produce a score reflecting relative risk of violence by the individual. The pre-defined categories include: expression of violent acts or thoughts of the individual; expression of negative feelings, thoughts or acts of others; expression of negative feelings, thoughts or acts of the individual; expression of family discord or tragedies; and expression of protective factors.

CROSS REFERENCE TO RELATED APPLICATIONS

The current application claims priority to U.S. Provisional application, Ser. No. 62/635,760, filed Feb. 27, 2018, the entire disclosure of which is incorporated by reference.

BACKGROUND

Between July 2013 and June 2014, there were a total of 48 school-associated violent deaths in the United States. Victims of these school-associated deaths included students, staff, and nonstudents. Of these deaths, 12 homicide victims and 8 suicide victims (42%) were between the ages of 5 and 8. In 2015, the rate of violent victimization for students ages 12-18 was higher at school than away from school [1, 2]. School violence has a far-reaching effect, impacting the entirety of a school population including staff and students. Studies have shown that there is poorer scholastic achievement and school attendance, along with higher dropout rates for youths attending the most violent schools [3-6].

There has been an increasing understanding of school-based crime prevention, of effective prevention programs, and also of what individual risk factors and school-related characteristics relate to crime [7, 8]. The most significant results of crime prevention occur when youth at elevated risk are given a specific school-based prevention program [7, 9]. Although there has been progress in the area of school-violence prevention, much work is still necessary to improve the current school violence risk assessment approach for mild, moderate, and severe violence. To establish a more sensitive and effective method for assessing school violence risk levels, risk factors, and protective factors, a more sophisticated approach is needed [10].

Previously, a school-based violence risk assessment that carefully evaluates the specific content of language used by students in middle and high school has not been widely accepted. Instead, students' risk levels are determined based on clinicians' impressions during the students' risk assessment. No risk assessment approach has incorporated direct analysis of students' interviews, and existing approaches have therefore yielded little information to guide the work of threat assessment teams [11, 12]. Paper risk assessments for violence, ranging from simple clinical impressions to structured professional judgements, have been proposed, but their correct identification rates of violent youth plateaued at less than 50% [13-16]. As such, a more sensitive school violence risk assessment approach may be achieved by understanding students' language.

SUMMARY

Current methods for school violence risk assessments are neither sensitive nor rapid, and have not been standardized. Compared to clinical impressions, using manual annotation could reduce clinical subjectivity in risk assessments. Manual annotation allows researchers to directly analyze the students' interviews and understand the behaviors, attitudes, feelings, language, technology use, and other activities they mentioned during their interviews. With manual annotation, identification and understanding of risk and protective factors have been achieved more objectively and sensitively than with previously available violence risk assessment methods. In addition, manual annotation helps with the future development of a computerized system (machine learning) that will automatically identify such information within interviews [17-19]. The current disclosure incorporates machine learning to complete the annotation process. In such a system the violence risk assessment can be performed substantially in real-time during the student interview, for example, to provide useful insights.

It is one aspect of the current disclosure to provide a system and a method for predicting risk of violence for an individual (primarily school violence, but not limited to school violence). The system and method perform the following steps: (a) receiving responses to questions from an individual in a digital form; (b) extracting by a computerized annotator words or phrases from the digital form of the questions and responses; (c) assigning by the annotator extracted word(s) or phrase(s) to at least one of a plurality of pre-defined categories; and (d) automatically scoring words or phrases that could be classified into the pre-defined categories by a trained machine-learning engine to produce a score reflecting relative risk of violence by the individual. The pre-defined categories include: expression of violent acts or thoughts of the individual; expression of negative feelings, thoughts or acts of others; expression of negative feelings, thoughts or acts of the individual; expression of family discord or tragedies; and expression of protective factors.

In a more detailed embodiment, the pre-defined categories also include one or more of the following: expression of illegal acts or contact with the judicial system by the individual, expression of violent media or video games, expression of self-harm thoughts or acts of the individual, expression of family discord or tragedies, expression of psychiatric diagnosis or symptoms, and expression of positive feelings, thoughts or acts of the individual. In a further detailed embodiment, the pre-defined categories also include each of these additional categories. Alternatively, or in addition, the pre-defined categories also include expression of verbal or physical response due to emotions of the individual.

In a more detailed embodiment, the questions to the individual were given from a pre-set questionnaire. In yet a further detailed embodiment, the questionnaire asks open-ended questions. In yet a further detailed embodiment, the questionnaire is the School Safety Scale questionnaire, which is based on the Historical-Clinical Risk Management-20 (HCR-20) questionnaire.

In a detailed embodiment, the trained machine-learning engine further generates warning markers from the identified words or phrases. In a further detailed embodiment, the warning markers are identification of specific assigned words or phrases, and/or generation of risk factors from the assigned words or phrases. Alternatively, or in addition, the trained machine-learning engine further considers demographic, socioeconomic status, social determinant, or environmental factor data of the individual in the scoring step. Alternatively, or in addition, the scoring step utilizes a Pearson Correlation coefficient. Alternatively, or in addition, the annotator utilizes natural language processing algorithms.

In another aspect, a system and method for assessing risk of violence (primarily, but not limited to, school violence) includes the following steps: receiving a digital natural language transcript of an individual in response to questions; converting the transcript to a predetermined form; extracting linguistic features from the transcript; assigning the extracted features automatically to at least one of a plurality of pre-determined categories; scoring the assigned features by a machine learning engine based on predetermined indicators; and producing a risk of violence score for the individual to prevent violence. In a more detailed embodiment, the step of converting a digital natural language transcript to a predetermined form includes: tokenizing and lemmatizing the digital transcript; removing punctuation from the transcript; and/or converting negated and temporal terms. Alternatively or in addition, the questions comprise an open-ended format.

In a detailed embodiment, the step of extracting features from the transcript includes: extracting a first feature set into semantic meaning categories; and extracting a second feature set of term frequency and inverse transcript frequency. In a further detailed embodiment, the step of extracting into semantic meaning categories includes: a linguistic inquiry and word count dictionary mapping specific words into 45 categories; and using word embedding to cluster words into 100 word categories. Alternatively, the step of extracting a second feature set further includes both semantic and context information.

In an alternate detailed embodiment, the predefined categories include: expression of negative feelings, thoughts, or acts of others; expression of positive feelings, thoughts, or acts of the individual; expression of family discord or tragedies; expression of violent acts or thoughts by the individual; and/or expression of protective factors. In a further detailed embodiment, the predefined categories further include: expression of psychiatric diagnosis or symptoms; expression of negative feelings, thoughts, or acts of the individual; expression of illegal acts or contact with the judicial system by the individual; engaging in violent acts or expressing violent thoughts by the individual; expression of self-harm thoughts or engaging in self-harm acts by the individual; and/or violent media or video games. In a further detailed embodiment, the pre-defined categories further include expression of verbal or physical response due to emotions of the individual.

In an alternate detailed embodiment, the machine learning engine comprises: multivariate logistic regression with L1 and L2 normalization; support vector machines with polynomial and radial basis function kernels; artificial neural networks; decision trees; and/or random forests. In a further detailed embodiment, the machine learning further comprises the use of a best-first search algorithm to select and identify key second feature extractions most closely indicative of risk of school violence.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 provides an exemplary system for practicing embodiments of the current disclosure;

FIG. 2 provides a flow diagram of an exemplary method according to an embodiment of the current disclosure;

FIG. 3 provides a flow diagram of an exemplary method according to an embodiment of the current disclosure;

FIG. 4 provides an AUC curve associated with a technology development example provided herein; and

FIG. 5 provides a flow diagram of an exemplary method according to an embodiment and technology development example provided herein.

DETAILED DESCRIPTION

Embodiments of the current disclosure provide an automated system (and associated methods) that assesses risk of school violence to augment professional judgement. In some embodiments, the assessment may occur in real-time or substantially in real-time (i.e., within seconds or within several minutes). The technology includes a protocol to interview participants with two school safety scales. By analyzing the interview content, this system automatically assesses participants' risk levels and identifies risk factors that need to be addressed. The advantage of this technology is that it establishes an efficient and effective method to assess adolescents who may be at risk for school violence. The system may help establish a comprehensive screening tool for school violence prevention. Embodiments of the technology can be used in schools, hospitals, and clinics. It is also likely that this service can be applied for background checks for adults who want to purchase guns.

While the detailed embodiments have been developed for use in determining relative risk of school violence, it will be apparent that the disclosure may also be used for many related areas, such as: background checks for adults who may want to buy guns, emergency room evaluations of children and adolescents, clinical evaluations at pediatrician offices and mental health agencies, evaluations within juvenile justice centers, criminal evaluations for the courts, custody evaluations of parents for the courts, government use, colleges, graduate schools, and/or military use.

Participant Interview. A Demographic Form along with two 14-item school safety scales, the Brief Rating of Aggression by Children and Adolescents (BRACHA) and the School Safety Scale (SSS), are provided to the participant (e.g., student or juvenile). In addition, the Psychiatric Intake Response Center (PIRC) assessment, which has been used at our hospital over the past decade, may also be provided to the participant. Collateral information may also be gathered from the participant's guardians using some questions taken from the two school safety scales, BRACHA and SSS, and the PIRC assessment.

The BRACHA questionnaire is a 14-item instrument, which has predictive performance for aggression by children and adolescents at Cincinnati Children's Hospital Medical Center (CCHMC). The SSS questionnaire was developed by modifying the Historical-Clinical-Risk Management-20 (HCR-20) scale to evaluate risk and protective factors associated with school violence for children and adolescents. The HCR-20 is a widely-used, valid and reliable scale for the prediction of violence by adults and includes ten historical items (irreversible), five current clinical items (reversible), and five future risk management (reversible) items. The HCR-20, the BRACHA, and the SSS overlap in areas that have been identified as important correlates to youth violence. The PIRC assessment, which is not a scale, allows for a semi-structured format that helps gather background information with open-ended and direct questions. The scales and assessment are consistent with the information from five domains including community, school, peer, family, and individual. The PIRC assessment incorporates the FBI's Four-Pronged Assessment Model to examine the personality of the participant (behavior characteristics and traits), school dynamics, social dynamics, and family dynamics. Example questions can be found in [20], as follows:

TABLE 1. Abbreviated BRACHA 0.9 Items and Response Options*

Item | Abbreviated BRACHA Items | Response Options
1 | Previous psychiatric hospitalization or day treatment placement | □ Yes □ No
2 | School suspension or expulsion | □ Yes □ No
3 | Trouble accepting adult authority at home or at school | □ Little or none □ Some □ A lot
4 | Frequency of physical aggression toward others (e.g., hitting, kicking, punching, biting, slapping, fights at school, throwing objects at others) | □ Never □ Occasionally □ Often
5 | Impulsiveness in the emergency department (e.g., often needing redirection, throwing objects, running out of the room, yelling at the interviewer, extremely talkative, etc.) | □ No incidents □ One or more incidents
6 | Intrusiveness in the emergency department (e.g., invading personal space, asking personal questions, etc.) | □ No incidents □ One or more incidents
7 | Attempts to harm others or violent acts with intent to seriously harm others (includes all weapons use, even without injury, if used with harmful intent) | □ Never □ Once □ More than once
8 | Violent ideation towards others (i.e., thoughts, wishes, or desires to harm other people) | □ Never □ Occasionally □ Often
9 | Actual expressions of violent intentions or plans to hurt others (includes text messages and e-mails) | □ Never □ Occasionally □ Often
10 | Acts that intentionally destroyed property (e.g., breaking objects, vandalism, fire setting, making holes in the walls; does not include accidents or throwing things) | □ Never □ Occasionally □ Often
11 | Threats or physical aggression towards self or others in the past 24 hours | □ Yes □ No
12 | Pattern of either verbal or physical aggression towards self or others | □ Yes □ No
13 | Aggressive behavior before age 10 years (e.g., firesetting, destruction of property, stealing, trying to seriously hurt a person or animal, bullying, frequent fights; does not include lying) | □ Never □ Occasionally □ Often
14 | Signs of remorse (such as responsibility, shame, and/or guilt) after violence or aggressive acts | □ Not aggressive, or if aggressive, displays remorse, guilt, shame, or responsibility □ If aggressive, displays no remorse, guilt, shame, or sense of responsibility

The School Safety Scale (SSS) employs a combination of both broad and specific questions related to violence risk factors. Questions are asked in an open-ended format so that the participant provides more detailed answers than “yes” or “no.” In cases where the participant answers with a yes or no, the participant is asked to explain his or her answer. The wording of the questions may be dependent on the student's age and cognitive level. Questions may also be asked based on background information gathered before the interviews from the schools and guardians.

Annotation. In an embodiment, the interviews of the subjects are recorded and immediately transcribed in a text or digital format. Key phrases and words associated with school violence risk are extracted from the transcribed interviews, preferably using a trained machine-learning annotation engine. The focus during annotation is to identify students' behaviors, attitudes, feelings, and uses of technology (social media and video games) with human disambiguation. School violence-related patterns are identified by the annotation engine within the students' interviews, and may be double annotated using a double annotation schema. The trained annotation engine identifies key words and phrases associated with students' behaviors from the interviews. Each extracted word or phrase may be standardized by the annotation engine by being assigned to one of 11 pre-defined categories. The 11 categories for annotation are: (1) Psychiatric diagnosis or symptoms, (2) Negative feelings, thoughts, or acts of subject, (3) Negative feelings, thoughts, or acts of others, (4) Positive feelings, thoughts, or acts of subject, (5) Illegal acts or contact with the judicial system by subject, (6) Violent acts or thoughts of subject, (7) Self-harm thoughts or acts of subject, (8) Family discord or tragedies, (9) Verbal or physical response due to emotions of subject, (10) Violent media or video games, (11) Protective factors (e.g., family support counselor). After annotation is completed, a consensus may be manually established by a child and adolescent forensic psychiatrist to resolve any discrepancies in the annotation engine decisions.

During development of exemplary embodiments, a reference set of school violence-related patterns was generated from the consensus of the extracted words and phrases and their corresponding categories. The statistical analysis on the annotations (manual annotations in this example) showed that there were significant language differences between the high risk and low risk groups for the following five annotation categories: negative feelings/thoughts/acts of subject, negative feelings/thoughts/acts of others, illegal acts or contact with the judicial system by subject, violent acts or thoughts of subject, and violent media or video games (p values <0.05).

The following process is performed in a detailed embodiment of Automated Risk Assessment. The participant interviews are tokenized and lemmatized, after which the punctuation is removed and the negated terms are converted. Two levels of features are then extracted from the processed interviews. The first feature set is created with word categories that summarize the semantic meaning of the interviews. The Linguistic Inquiry and Word Count (LIWC) dictionary is applied to map specific words into 45 categories related to positive/negative emotions, perceptions, personal concerns, and cognitive processes. Word embedding is used to cluster words into 100 word categories (WEC) in an unsupervised manner. The second feature set is n-grams (n<5) with term frequency-inverse document frequency weighting to capture both semantic and context information. To predict risk of school violence, this embodiment uses three machine learning algorithms including multivariate logistic regression with L1 and L2 normalization, support vector machines with polynomial and radial basis function kernels, and random forests. To capture potential warning markers associated with school violence, this embodiment applies an iterative step-forward approach with “best first” search to select and identify key n-grams. By analyzing the content from participant interviews, the machine learning algorithms detect risk of school violence for individual students. In addition, identification of key predictors reveals multiple warning markers that could deliver useful insights into potential causes of school violence.
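The preprocessing step described above can be illustrated with a minimal sketch. It is not the disclosed implementation: the tiny `LEMMAS` table, the `NEGATORS` set, and the `NOT_` prefix used to mark negated terms are all assumptions chosen for illustration, and a working system would rely on a full lemmatizer and a more careful treatment of negation scope.

```python
import re

# Hypothetical, tiny lemma table; a real system would use a full lemmatizer.
LEMMAS = {"hitting": "hit", "fights": "fight", "felt": "feel", "threats": "threat"}
# Hypothetical negation cues; the source does not list the ones it uses.
NEGATORS = {"not", "no", "never"}
PUNCTUATION = set(".,!?;")

def preprocess(text):
    """Tokenize, lemmatize, drop punctuation, and mark negated terms."""
    tokens = re.findall(r"[a-z']+|[.,!?;]", text.lower())
    out = []
    negate = False
    for tok in tokens:
        if tok in PUNCTUATION:
            negate = False  # assume negation scope ends at punctuation
            continue        # punctuation itself is removed
        if tok in NEGATORS or tok.endswith("n't"):
            negate = True   # following terms are converted until punctuation
            continue
        lemma = LEMMAS.get(tok, tok)
        out.append("NOT_" + lemma if negate else lemma)
    return out

tokens = preprocess("I never felt angry, but fights happen.")
```

Marking negated terms with a prefix keeps "angry" and "not angry" as distinct features in the later n-gram extraction, which is one plausible reason to convert negations before feature extraction.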

Embodiments of the current disclosure may provide two outcomes: a risk score and/or a set of warning markers (i.e., text content of risk factors). The risk score will inform if clinical intervention is required. However, to deliver useful insights into potential causes of school violence, warning markers may be identified from the interview content. Embodiments of the current disclosure leverage machine learning technologies, including conditional random fields and recurrent neural networks, to detect linguistic descriptors (i.e., raw text of the annotation categories) associated with students' violent behaviors. The detected warning markers will help generate actionable recommendations to inform individualized clinical and school interventions. The combination of the two outcomes may complete the attainment of the system's overall objective of automating school violence risk assessments.

FIG. 1 provides an exemplary system 100 that may be implemented to practice embodiments of the current disclosure. The system 100 may include a Natural Language Processor and/or Annotation Engine 102 that performs the processes, as described herein in detail, associated with receiving a digital transcript 104 of the individual's responses to the posed questions and for identifying and extracting words or phrases (or other forms of expression) 106 from the digital transcript. The extracted words or phrases 106 are extracted based upon a trained classifier and/or other form of a rule-set contained in database 108. The extracted words/phrases 106 are received by the Categorizing & Scoring Engine 110, which assigns the extracted words/phrases 106 to one or more pre-determined categories, as described herein in detail, and also scores each of the assigned words/phrases. The Categorizing & Scoring Engine 110 compiles the categorized and weighted words/phrases to produce an overall risk score 112, which is provided to a user's computerized and networked user interface device 114 via the Internet or some other network 116. As shown in FIG. 1, the digital transcript 104 may be provided by the user's computerized and networked user interface device 114 to the system 100 via the network 116. The Categorizing & Scoring Engine 110 may operate based upon trained classifiers and/or other forms of rule-sets contained in database 118.

The computerized and networked user interface device 114 can be in the form of a smart phone, a tablet computer, a laptop or desktop computer, smart display, personal assistant device, a computerized wearable appliance such as a smart watch or smart glasses, and the like. The computerized and networked user interface device 114 may include a display 120, and a user input device such as a cursor control device 122 (or a touch screen or a voice activated control, or a motion sensor, or an eye movement sensor and the like as are readily available to the art), a camera and associated processing circuitry. The computerized and networked user interface device 114 may operate to perform various software applications such as a computerized tool which may be in the form of a personal application associated with presenting the questionnaires, receiving and recording the user's answers and for presenting the risk scores and other output (as described herein) to the user. In the current embodiment, the application may include a graphical user interface displayed on the display screen 120 and controlled by, and/or receiving user input from, the user input devices such as the cursor control device 122 and/or a touch screen. The user device circuitry may include a network circuit for connecting wirelessly with the computer network 116 for the purpose of receiving and/or transmitting data over the computer network 116.

The system 100 may utilize various computer servers and/or distributed computing devices also accessible thereto and may additionally include various data storage devices 108/118 operatively coupled by a data connection thereto. For example, the software application may include operations being performed on one or more of the computer servers/devices and/or on the device circuitry. Likewise, data storage associated with the software application may be within one or more of the data storage devices 108/118 and/or on the device circuitry.

As shown in FIG. 2, a method for predicting risk of violence for an individual (primarily school violence, but not limited to school violence) performs the following steps: (200) receiving responses to questions from an individual in a digital form (e.g., receiving the digital transcript 104); (202) extracting by a computerized annotator words or phrases from the digital form of the questions and responses; (204) assigning by the annotator extracted word(s) or phrase(s) to at least one of a plurality of pre-defined categories; (206) automatically weighing words and/or phrases that could be classified into the pre-defined categories by a trained machine-learning engine; and (208) producing a score 112 reflecting relative risk of violence by the individual from a compilation of the categorized and scored words and/or phrases. In an embodiment, the risk score 112 is produced as a result of a simple summation of the scored words and/or phrases. It will be appreciated, however, that there are numerous alternative ways to compile an overall risk score from the scored words and/or phrases. The pre-defined categories include: expression of violent acts or thoughts of the individual; expression of negative feelings, thoughts or acts of others; expression of negative feelings, thoughts or acts of the individual; expression of family discord or tragedies; and expression of protective factors.
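The simple-summation embodiment of step (208) can be sketched as follows. The category names and the numeric weights in `CATEGORY_WEIGHTS` are hypothetical placeholders, not values from the disclosure; in the described system the weights would come from the trained machine-learning engine.

```python
# Hypothetical per-category weights; a trained engine would supply these.
CATEGORY_WEIGHTS = {
    "violent_acts_or_thoughts": 2.0,
    "negative_feelings_subject": 1.0,
    "negative_feelings_others": 1.0,
    "family_discord": 0.5,
    "protective_factors": -1.0,  # assumed to lower the score
}

def risk_score(annotations):
    """Sum the weights of (phrase, category) annotations into one score."""
    return sum(CATEGORY_WEIGHTS[category] for _phrase, category in annotations)

# Illustrative annotations, as steps (202) and (204) might produce them.
annotations = [
    ("want to hurt him", "violent_acts_or_thoughts"),
    ("my parents fight", "family_discord"),
    ("my counselor helps", "protective_factors"),
]
score = risk_score(annotations)
```

Giving protective factors a negative weight is one way a summation could reflect their mitigating role; the source does not say how protective factors enter the score.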

In a more detailed embodiment of FIG. 2, the pre-defined categories also include one or more of the following: expression of illegal acts or contact with the judicial system by the individual, expression of violent media or video games, expression of self-harm thoughts or acts of the individual, expression of family discord or tragedies, expression of psychiatric diagnosis or symptoms, and expression of positive feelings, thoughts or acts of the individual. In a further detailed embodiment, the pre-defined categories also include each of these additional categories. Alternatively, or in addition, the pre-defined categories also include expression of verbal or physical response due to emotions of the individual.

In a more detailed embodiment of the method of FIG. 2, the questions to the individual were given from a pre-set questionnaire. In yet a further detailed embodiment, the questionnaire asks open-ended questions. In yet a further detailed embodiment, the questionnaire is the School Safety Scale (SSS) questionnaire, which is based on a modified version of the Historical-Clinical Risk Management-20 (HCR-20) questionnaire. The Brief Rating of Aggression by Children and Adolescents (BRACHA) questionnaire is a separate questionnaire that may be used alone or in addition to the SSS.

In an alternate detailed embodiment of the method of FIG. 2, the trained machine-learning engine further generates warning markers from the identified words or phrases. In a further detailed embodiment, the warning markers are identification of specific assigned words or phrases, and/or generation of risk factors from the assigned words or phrases. Alternatively, or in addition, the trained machine-learning engine further considers demographic, socioeconomic status, social determinant, or environmental factor data of the individual in the scoring step. Alternatively, or in addition, the scoring step (206) utilizes a Pearson Correlation coefficient. Alternatively, or in addition, the annotator utilizes natural language processing algorithms.

As shown in FIG. 3, another embodiment provides a system and method for assessing risk of violence (primarily, but not limited to, school violence) that includes the following steps: (300) receiving a digital natural language transcript of an individual in response to questions; (302) converting the transcript to a predetermined form; (304) extracting features from the transcript; (306) assigning the extracted features automatically to at least one of a plurality of pre-determined categories; (308) scoring the assigned features by a machine learning engine based on predetermined indicators; and (310) producing a risk of violence score for the individual to prevent violence. In a more detailed embodiment, the step of converting a digital natural language transcript to a predetermined form includes: tokenizing and lemmatizing the digital transcript; removing punctuation from the transcript; and/or converting negated and temporal terms. Alternatively or in addition, the questions comprise an open-ended format.

In a detailed embodiment of the method of FIG. 3, the step (304) of extracting features from the transcript includes: extracting a first feature set into semantic meaning categories; and extracting a second feature set of term frequency and inverse transcript frequency. In a further detailed embodiment, the step of extracting into semantic meaning categories includes: a linguistic inquiry and word count dictionary mapping specific words into 45 categories; and using word embedding to cluster words into 100 word categories. Alternatively, the step of extracting a second feature set further includes both semantic and context information.
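The second feature set, n-grams weighted by term frequency and inverse transcript (document) frequency, can be sketched in plain Python. This is an illustrative sketch only: it uses the basic unsmoothed weighting tf × log(N / df), which is one of several common TF-IDF variants, and the source does not state which variant the embodiment uses. The function names and the toy transcripts are assumptions.

```python
import math
from collections import Counter

def ngrams(tokens, max_n=4):
    """Yield every n-gram with 1 <= n <= max_n (that is, n < 5)."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])

def tfidf(documents, max_n=4):
    """Return one {ngram: weight} dict per tokenized transcript."""
    counts = [Counter(ngrams(doc, max_n)) for doc in documents]
    # Number of transcripts in which each n-gram occurs at least once.
    doc_freq = Counter(gram for c in counts for gram in c)
    n_docs = len(documents)
    weights = []
    for c in counts:
        total = sum(c.values())
        weights.append({
            gram: (tf / total) * math.log(n_docs / doc_freq[gram])
            for gram, tf in c.items()
        })
    return weights

# Toy tokenized transcripts, for illustration only.
docs = [["i", "hit", "him"], ["i", "like", "school"]]
features = tfidf(docs)
```

An n-gram that appears in every transcript (here, "i") receives zero weight, while n-grams specific to one transcript receive positive weight, which is how the weighting emphasizes distinctive language.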

In an alternate detailed embodiment of the method of FIG. 3, the predefined categories include: expression of negative feelings, thoughts, or acts of others; expression of positive feelings, thoughts, or acts of the individual; expression of family discord or tragedies; expression of violent acts or thoughts by the individual; and/or expression of protective factors. In a further detailed embodiment, the predefined categories further include: expression of psychiatric diagnosis or symptoms; expression of negative feelings, thoughts, or acts of the individual; expression of illegal acts or contact with the judicial system by the individual; engaging in violent acts or expressing violent thoughts by the individual; expression of self-harm thoughts or engaging in self-harm acts by the individual; and/or violent media or video games. In a further detailed embodiment, the pre-defined categories further include expression of verbal or physical response due to emotions of the individual.

In an alternate detailed embodiment of the method of FIG. 3, the machine learning engine comprises: multivariate logistic regression with L1 and L2 normalization; support vector machines with polynomial and radial basis function kernels; artificial neural networks; decision trees; and/or random forests. In a further detailed embodiment, the machine learning further comprises the use of a best-first search algorithm to select and identify key features most closely indicative of risk of school violence.
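A step-forward feature selection of the kind referred to above can be sketched as a greedy loop. This is a simplified stand-in, not the disclosed algorithm: a full best-first search keeps a queue of candidate subsets and can backtrack, whereas this sketch only ever extends the current best subset. The `score` callback is hypothetical; in practice it might be a cross-validated AUC of a classifier trained on the candidate features.

```python
def forward_select(features, score, max_features=None):
    """Greedy step-forward selection: repeatedly add the feature that
    most improves score(subset), stopping when nothing improves it."""
    selected = []
    best = score(selected)
    remaining = list(features)
    while remaining and (max_features is None or len(selected) < max_features):
        candidate, candidate_score = None, best
        for feature in remaining:
            s = score(selected + [feature])
            if s > candidate_score:
                candidate, candidate_score = feature, s
        if candidate is None:
            break  # no remaining feature improves the score
        selected.append(candidate)
        remaining.remove(candidate)
        best = candidate_score
    return selected, best

# Toy scoring function (an assumption): two n-grams help, the rest hurt slightly.
USEFUL = {"hurt": 0.2, "gun": 0.15}

def toy_score(subset):
    return 0.5 + sum(USEFUL.get(f, -0.01) for f in subset)

chosen, chosen_score = forward_select(["school", "hurt", "gun", "lunch"], toy_score)
```

The features retained by such a procedure are natural candidates for the warning markers described elsewhere in this disclosure, since they are the ones that most improve prediction.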

Technology Development Example I

Researchers conducted evaluations on 103 participants (49% male, 51% female) who were recruited from the Cincinnati Children's Hospital Medical Center inpatient units, outpatient clinics and Emergency Department. Participants ranged from ages 12-18 and were actively enrolled in 74 traditional public schools (non-online education). No participants were in the custody of the state or county. Collateral information from guardians was gathered prior to participant evaluation. An open-ended list of questions was used to initiate the evaluations. Each participant was also asked questions from the Brief Rating of Aggression by Children and Adolescents (BRACHA-School Version) and the School Safety Scale (SSS). Evaluations were recorded and transcribed into text documents. Results: The 103 transcripts were annotated using a carefully created set of guidelines, where the keywords identified were placed into one of twelve specific categories (e.g., “impulsivity”, “negative feelings, thoughts or acts of subject” and “negative feelings, thoughts or acts of others”). A Pearson Correlation coefficient was computed, showing trending significance of “Risk to others” with five annotation categories. “Negative feelings thoughts or actions of subject” (0.48), “Negative feelings thoughts or acts of others” (0.40), “Illegal acts or contact with the Judicial system by subject” (0.31), “Violent media or video games” (0.44) and “Violent acts or thoughts of subjects” (0.68) all showed a positive correlation. An unpaired t-test was conducted and results for each of these categories were found to be significant at the P<0.01 level. By leveraging natural language processing and machine learning technologies, we further developed a computerized model to automatically analyze interview transcripts and predict if a student has high risk of violence towards others. The area under the ROC curve achieved by the model was 91.4%, indicating that more than 90% of subjects received.
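For reference, the Pearson correlation coefficient used in this example can be computed as shown in the following sketch. The input values are made-up toy numbers (a per-participant annotation count for one category and a binary risk label), not data from the study.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Toy data (assumed): annotation counts in one category per participant,
# and whether each participant was judged a risk to others (1) or not (0).
category_counts = [0, 1, 2, 3, 4]
risk_to_others = [0, 0, 1, 1, 1]
r = pearson_r(category_counts, risk_to_others)
```

A value of `r` near +1 means participants with more annotations in the category tend to be the ones judged at risk; a value near 0 means the count carries little linear information about the label.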

Technology Development Example II

This study focused on developing a machine learning (ML) model to predict risk of school violence and to identify risk characteristics for individual students.

Participants and inclusion criteria. During the study period we prospectively recruited 101 students from 74 middle and high schools in Ohio, Kentucky, Indiana, and Tennessee. The students were directly referred to our risk assessment team from schools, or from inpatient and outpatient clinics. We focused on referrals for students who 1) had any significant behavioral change and concern, verbal/physical aggression, or threats toward others or property, 2) had self-harm thoughts and behaviors, and 3) had behavioral changes of becoming odd, quiet, withdrawn, or isolative. All legal guardians consented and all students assented for the study (consent rate=100%).

School violence risk assessment. A risk assessment was completed as soon as possible from the initial referral. The research team interviewed each student with three scales: 1) Brief Rating of Aggression by Children and Adolescents (BRACHA) that assesses aggression by children and adolescents, 2) School Safety Scale (SSS) that evaluates risk and protective factors for school violence behaviors, and 3) Psychiatric Intake Response Center (PIRC) assessment that collects background information including community, school, peer, family, and individual. All questions were asked in an open-ended format so that the student would provide more detailed answers than “Yes/No”. The wording of the questions was dependent on the student's age and cognitive level. The interview was audio recorded and transcribed thereafter. Table 2 shows the descriptive statistics of the scales and transcripts. By reviewing the interview and collateral information from the guardians, the research team assessed the student's behaviors, attitudes, feelings, and technology use (e.g., social media) to determine their risk level towards others (low or high).

Predicting risk of school violence. In the study we sought to predict students' risk levels based on their interviews and home information. Clinical judgments for the 101 students served as the gold standard to train and evaluate the predictive models. The baseline was a predictive model based on home information including demographics and socioeconomic status (home income, education, and public assistance). The student interviews were tokenized and lemmatized, after which punctuation was removed and negated terms were converted. We then extracted two levels of features from the processed interviews. The first feature set was created with word categories that summarized the semantic meaning of the interviews. The Linguistic Inquiry and Word Count (LIWC) dictionary was applied to map specific words into 45 categories related to positive/negative emotions, perceptions, personal concerns, and cognitive processes. We also used word embedding to cluster words into 100 word categories (WEC) in an unsupervised manner. The second feature set was n-grams (n<5) with term frequency-inverse document frequency weighting to capture both semantic and context information. To predict risk of school violence, we used three standard machine learning algorithms including multivariate logistic regression (LR) with L1 and L2 normalization, support vector machines with polynomial (SVM-P) and radial basis function (SVM-R) kernels, and random forests (RF). A nested ten-fold cross-validation was utilized in training and testing the algorithms so that we could evaluate model performances on all 101 examples. We used the area under the receiver operating characteristic curve (AUC) as the primary measure for evaluation. To capture potential warning markers associated with school violence, we applied an iterative step-forward approach (SVM-P) with “best first” search to select and identify key n-grams.
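The n-gram feature set described above can be sketched as follows. This is a minimal illustration, assuming the interviews have already been tokenized and lemmatized upstream and assuming the plain `log(N / df)` form of inverse document frequency; the study does not state which TF-IDF variant was used, and the function names `ngrams` and `tfidf` are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def ngrams(tokens: List[str], max_n: int = 4) -> List[str]:
    """All contiguous n-grams with 1 <= n <= max_n (n < 5 by default)."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]


def tfidf(docs: List[List[str]], max_n: int = 4) -> List[Dict[str, float]]:
    """TF-IDF weight of every n-gram in every document.

    tf  = n-gram count / total n-grams in the document
    idf = log(number of documents / documents containing the n-gram)
    """
    counts = [Counter(ngrams(doc, max_n)) for doc in docs]
    doc_freq = Counter(gram for c in counts for gram in c)
    n_docs = len(docs)
    weights = []
    for c in counts:
        total = sum(c.values())
        weights.append(
            {gram: (k / total) * math.log(n_docs / doc_freq[gram]) for gram, k in c.items()}
        )
    return weights


# Toy lemmatized token lists, invented for illustration.
docs = [["want", "to", "hurt", "anyone"], ["go", "to", "school"]]
weights = tfidf(docs)
```

An n-gram that occurs in every document (here the unigram "to") receives a weight of zero, while an n-gram confined to one document keeps a positive weight.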

Results. Based on clinical judgment, 54 students (53%) were considered high risk towards others. There was no significant difference in demographics between the high risk and low risk groups. However, risk of violence increased significantly with a lower socioeconomic status (p-value=0.05 for public assistance and p-value=0.01 for household income under Chi-square test). Table 3 presents the performances of the ML algorithms with different feature sets. The best-performing algorithm was SVM-P with n-gram features from the BRACHA scale (AUC=93.03%), where the improvements over algorithms with individual feature sets (e.g., individual word categories, n-grams from SSS and PIRC) were statistically significant. FIG. 4 shows the AUC curve when incrementally adding the first 500 n-gram features using best first search. By utilizing only 21 n-gram features, an SVM classifier plateaued at an AUC of 98.02%. To identify violence warning markers, we present in Table 4 predictors from the top 21 n-grams that were also significantly associated with school violence under Chi-square test (p-value<0.05).
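For readers less familiar with the metric, the AUC can be read as the probability that a randomly chosen high-risk student receives a higher model score than a randomly chosen low-risk student, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the function name `auc` and the toy labels and scores are illustrative assumptions, not study data.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    # A positive scored above a negative counts 1, a tie counts 0.5.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 1 = high risk, 0 = low risk (invented values).
example_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

The pairwise form is quadratic in the number of examples, which is acceptable for a cohort of about one hundred students.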

Discussion and Conclusion. Compared with the baseline that used home information, the ML algorithms leveraging interview content achieved substantially better performance in detecting participants' risk of school violence (Table 3). In addition, the n-gram features that captured both semantic and context information were shown to be more predictive than word categories. The PIRC assessment was used for collecting background information and was therefore less predictive. The feature selection identified a set of key predictors that helped improve the performance significantly to 98.02% (p-value=0.028). The majority of predictors were related to discussion of participants' past violent behaviors (e.g., fight, threat to others). They also captured participants' violent thoughts (e.g., want to hurt anyone, burn house) that could be warning signs of school violence.

By analyzing the content from participant interviews, the machine learning models showed capacity for detecting risk of school violence for individual students. In addition, identification of key predictors revealed multiple warning markers that could deliver useful insights into potential causes of school violence.

TABLE 2 Descriptive statistics of the scale questionnaires and transcripts (average number across all interviews).

| Scale | Items | Questions | Words | Words per Question | Words per Answer | Ans./Que. Ratio |
|---|---|---|---|---|---|---|
| BRACHA | 14 | 25 ± 12 | 537 ± 324 | 12 ± 3 | 10 ± 8 | 0.78 ± 0.62 |
| SSS | 14 | 56 ± 22 | 1117 ± 541 | 11 ± 2 | 10 ± 7 | 1.04 ± 0.68 |
| PIRC | 22 | 25 ± 11 | 551 ± 280 | 10 ± 2 | 13 ± 10 | 1.43 ± 1.04 |

TABLE 3 Classification performance of the machine learning algorithms with different feature sets. Nested Ten-fold Cross Validation Performance [%].

| Feature type | Feature set | LR + L2 | LR + L1 | SVM-P | SVM-R | RF | p-value* |
|---|---|---|---|---|---|---|---|
| Baseline | Demographics + socioeconomic status | 67.89 | 68.79 | 67.57 | 62.92 | 62.14 | 3.0E−3 |
| Word Category | LIWC | 77.15 | 67.77 | 76.95 | 73.01 | 73.72 | 1.2E−2 |
| Word Category | WEC | 81.76 | 77.86 | 85.26 | 80.50 | 77.78 | 2.2E−2 |
| Word Category | LIWC + WEC | 87.16 | 84.28 | 87.04 | 80.54 | 82.37 | 0.15 |
| N-gram | BRACHA | 92.43 | 87.90 | 93.03 | 89.60 | 91.06 | N/A |
| N-gram | SSS | 81.91 | 73.72 | 81.52 | 82.03 | 74.03 | 3.5E−2 |
| N-gram | PIRC | 60.36 | 67.73 | 60.91 | 58.08 | 65.37 | 3.0E−3 |
| N-gram | BRACHA + SSS | 90.50 | 80.02 | 90.11 | 87.71 | 83.88 | 0.24 |
| N-gram | BRACHA + PIRC | 88.93 | 80.73 | 89.36 | 84.91 | 84.12 | 0.24 |
| N-gram | SSS + PIRC | 84.71 | 77.03 | 83.88 | 82.43 | 75.30 | 6.2E−2 |
| N-gram | BRACHA + SSS + PIRC | 91.06 | 78.25 | 91.33 | 86.49 | 83.90 | 0.29 |

*Paired T-test of the performance difference between the best algorithm (SVM-P + BRACHA n-gram) and the others.

TABLE 4 Key n-gram predictors and their implication.

| Rank | n-gram | Interpretation & Examples |
|---|---|---|
| 1 | you ever make a | From the question “have you ever make a write or verbal threat” |
| 2 | to hurt anyone | From answers like “try/want to hurt anyone” |
| 3 | ever be in a fight | From the question “have you ever be in a fight” |
| 4 | house | From answers like “burn down my house/blow your house up” |
| 7 | you want to kill | From questions like “you want to kill her/people” |
| 13 | how | From questions like “how many fights/how often” |
| 21 | a while | Answers like “once in a while” for “how often do you get into fight” |

Technology Development Example III

Methods: 103 participants were recruited through Cincinnati Children's Hospital Medical Center (CCHMC) from psychiatry outpatient clinics, the inpatient units, and the emergency department. Participants (ages 12-18) were active students in 74 traditional schools (i.e., non-online education). Collateral was gathered from guardians before participants were evaluated. School risk evaluations were performed with each participant, and audio recordings from the evaluations were later transcribed and manually annotated. The BRACHA (School Version) and the School Safety Scale (SSS), both 14-item scales, were used. A template of open-ended questions was also used. Results: This analysis included 103 participants who were recruited from 74 different schools. Of the 103 students evaluated, 55 were found to be moderate to high risk to others and 48 were found to be low risk to others. Both the BRACHA and the SSS were highly correlated with risk of violence to others (Pearson correlations>0.82). There were significant differences in BRACHA and SSS total scores between low risk and high risk to others groups (p-values<0.001 under unpaired t-test). In particular, there were significant differences in individual SSS items between the two groups (p-value<0.001). Of these items, Previous Violent Behavior (Pearson Correlation=0.80), Impulsivity (0.69), School Problems (0.64), and Negative Attitudes (0.61) showed high correlation with risk to others. The novel machine learning algorithm achieved an AUC of 91.02% when using the interview content to predict risk of school violence, and the AUC increased to 91.45% when demographic and socioeconomic data were added. Conclusion: Our study indicates that the BRACHA and SSS are clinically useful for assessing risk for school violence. The machine learning algorithm was highly accurate in assessing school violence risk.

This work is an expansion of our previous child and adolescent violence research from the hospital into the schools [16, 20, 21]. The design for this study was approved by the institutional review board (IRB) at Cincinnati Children's Hospital Medical Center (CCHMC, study ID: 2014-5033). FIG. 5 presents a diagrammed overview of the study.

Students were evaluated using a school risk assessment protocol developed in our earlier research [21]. When a student displayed a major or minor behavioral change or aggression towards self or others, the school mental health counselor contacted the risk assessment research team to set up a meeting time to discuss the reason for concern along with the student's background information. The research team then contacted the student's guardians to discuss the study and possible participation of the student. Students were also recruited from the CCHMC's inpatient psychiatry units, outpatient clinics, and the emergency department. The research team obtained permission from the student's clinical team prior to discussing the study with the student and guardians. If all parties agreed to participate, signed informed consent and assent forms were obtained in person, along with signed release of information forms.

After consent was received, the guardian was interviewed first to gather collateral information that would help the research team better evaluate the student. Some questions for the guardian were taken from the same risk assessment questions used to evaluate the student. The collateral from the guardians was later used during the students' interviews, allowing the interviewers to clinically determine the most effective way to phrase the questions, as well as to compare the accuracy between what was reported by the guardian and student. Students were identified only with a subject number and their names were never spoken during the recorded interviews to help keep the students' identities anonymous. The interviews with the students were audio recorded, after which the recordings were immediately placed on a secured computer and sent to be transcribed.

Our research team provided clinical impressions and recommendations to the school, without divulging the risk level of the student to avoid possible stigma. We also provided recommendations to the guardians as well as referrals for treatment when indicated. Currently there are no validity data from a large U.S. prospective school research study. Hence, our research team documented risk levels to self or others (low or moderate/high) based on clinical judgment rather than on automatic cutoff scores from the risk scales. This study defined a “high risk” student as a student for whom a moderate or high risk level was given for physical aggression at school.

Participants. We prospectively recruited 103 middle and high school students from 74 schools in Ohio, Kentucky, Indiana, and Tennessee during this study period. All legal guardians consented and all students assented after meeting in person and discussing the study (consent/assent rate=100%). Based on clinical judgments, 47 of the 103 students (45.6%) were considered high risk to self. Fifty-five of the 103 students (53.4%) were considered high risk to others. Table 5 shows the characteristics of the participants. Forty-eight percent were male. The moderate/high risk to others group differed significantly from the low risk to others group only with regard to home income (p<0.05). Although these two groups differed in the proportion of Hispanic participants, the finding warrants further investigation because there were only seven Hispanic participants in total. There was no significant difference between the high risk and low risk groups in the number of students recruited from the inpatient and outpatient units (p=0.79).

Inclusion Criteria. Participants of this study were between the ages of 12 and 18. They were required to be enrolled in school, excluding those who were homeschooled or enrolled in online school. Participants were also not to be in state custody. Participants of all races and socioeconomic standings were included. Participants' legal guardians provided informed consent in person and gave permission (releases of information) for the collection of information from schools, the providing of information to schools, and for the evaluation of the student. Assent was also obtained from the student prior to the interview. For this study, students were either directly referred to the research team from their school or recruited from our inpatient units, outpatient clinic, or emergency department. A referral was made when there was any concern for the student due to a significant behavioral change or verbal/physical aggression. Referrals were also made when students displayed behavioral changes of becoming quiet, withdrawn, or isolated, which can be warning signs of school violence. All students, regardless of possible risk level, were included. Students who had a primary concern of self-harm thoughts or behaviors were also included.

Materials. For each student, a Demographic Form along with two 14-item school safety scales, the Brief Rating of Aggression by Children and Adolescents (BRACHA) and the School Safety Scale (SSS), were used [16, 21]. In addition, the Psychiatric Intake Response Center (PIRC) assessment, which has been used at our hospital over the past decade, was also used. Collateral information was gathered from the guardians using questions taken from the two school safety scales, BRACHA and SSS, and the PIRC assessment.

The BRACHA is a 14-item instrument, which has predictive performance for aggression by children and adolescents in the hospital [16, 20]. The SSS was developed by modifying the Historical-Clinical-Risk Management-20 (HCR-20) scale [22]. The HCR-20 is a widely used, valid, and reliable scale for the prediction of violence by adults and includes ten historical items (irreversible), five current clinical items (reversible), and five future risk management (reversible) items [22]. The SSS's original purpose was to assess risk and protective factors for potential against-medical-advice discharges for children and adolescents from the psychiatry units [23]. The HCR-20, the BRACHA, and the SSS overlap in areas that were previously identified as important correlates to youth violence [24]. The BRACHA and SSS have both been used for about a decade at our hospital, yet they had not been used for school safety prior to our research [21].

The PIRC assessment, which is not a scale, allowed for a semi-structured format that helped us to gather background information with open-ended and direct questions. The scales and assessment used in our study are consistent with the information from five domains including community, school, peer, family, and individual [8]. The PIRC assessment incorporated the FBI's Four-Pronged Assessment Model to examine the personality of the student (behavior characteristics and traits), school dynamics, social dynamics, and family dynamics while we interviewed the student and gathered collateral information from relevant sources [25].

The school risk assessment employed a combination of both broad and specific questions related to violence risk factors. Questions were asked in an open-ended format so that the student would provide a more detailed answer than “yes” or “no”. In cases where the student did answer with a yes or no, he or she was asked to explain his or her answer. The wording of the questions was dependent on the student's age and cognitive level. Questions were also asked based on background information gathered before the interviews from the schools and guardians.

Manual Annotation and Statistical Analysis. The interviews of the subjects were recorded and immediately transcribed in a text format. Key phrases and words associated with school violence risk were manually extracted from the transcribed interviews. The sole focus during annotation was to identify students' behaviors, attitudes, feelings, and uses of technology (social media and video games) [24].

School violence-related patterns were identified within the students' interviews, and were double annotated using a double annotation schema [26]. Double annotation assures a more uniform measurement than a single annotation and is a standard method used in clinical research [27]. Annotators first identified key words and phrases associated with students' behaviors from the interviews by following a guideline developed by the research team. Each extracted word or phrase was standardized by being assigned to one of 11 pre-defined categories. After annotation was completed, a consensus was established by a child and adolescent forensic psychiatrist to resolve any discrepancies between the annotators' decisions. A reference set of school violence-related patterns was generated from the consensus of the extracted words and phrases and their corresponding categories.
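As a rough illustration of the double-annotation step, the sketch below compares two annotators' outputs and separates the items they agree on from those that would go to the adjudicator. The data structure (a set of phrase and category pairs), the function name `compare_annotations`, and the example phrases are all assumptions made for illustration; the source does not describe the annotation tooling.

```python
from typing import Dict, Set, Tuple

# (extracted word or phrase, assigned category)
Annotation = Tuple[str, str]


def compare_annotations(
    first: Set[Annotation], second: Set[Annotation]
) -> Dict[str, Set[Annotation]]:
    """Split two annotators' outputs into agreed items and discrepancies."""
    return {
        "agreed": first & second,         # identical phrase and category
        "to_adjudicate": first ^ second,  # found by only one annotator, or categorized differently
    }


# Hypothetical annotations, invented for illustration.
annotator_a = {
    ("got in a fight", "Violent acts or thoughts of subject"),
    ("my counselor helps", "Protective factors"),
}
annotator_b = {
    ("got in a fight", "Violent acts or thoughts of subject"),
    ("my counselor helps", "Positive feelings, thoughts, or acts of subject"),
}
result = compare_annotations(annotator_a, annotator_b)
```

Here the second phrase appears twice in the discrepancy set, once per category, which makes the disagreement explicit for the reviewer.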

Statistical analysis was performed to assess the relationship between the BRACHA and the SSS scores, the key linguistic factors, and the risk of school violence. To demonstrate the distribution of demographics, descriptive analysis was performed (see Table 5). A two-sample t-test compared the numerical variables (e.g., SSS) between high risk and low risk students. A χ² test was used to compare the categorical variables (e.g., linguistic patterns). Pearson's correlation coefficients were computed to assess the association between the BRACHA/SSS scores, the linguistic factors, and the risk levels.
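The correlation analysis can be sketched as below. It assumes the risk level is coded as 0 (low) or 1 (moderate/high), in which case Pearson's coefficient is equivalent to a point-biserial correlation. The function name `pearson` and the toy counts are hypothetical and are not study data.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Toy data: number of annotated phrases in one category per student,
# and the clinician-assigned risk level (0 = low, 1 = moderate/high).
category_counts = [0, 1, 2, 5, 6, 7]
risk_level = [0, 0, 0, 1, 1, 1]
r = pearson(category_counts, risk_level)
```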

The 11 categories for annotation are: (1) Psychiatric diagnosis or symptoms, (2) Negative feelings, thoughts, or acts of subject, (3) Negative feelings, thoughts, or acts of others, (4) Positive feelings, thoughts, or acts of subject, (5) Illegal acts or contact with the judicial system by subject, (6) Violent acts or thoughts of subject, (7) Self-harm thoughts or acts of subject, (8) Family discord or tragedies, (9) Verbal or physical response due to emotions of subject, (10) Violent media or video games, (11) Protective factors (e.g., family support, counselor).

Results. The total BRACHA score and the total SSS score were significantly correlated with the risk to others (p values <0.05) but not with risk to self. More specifically, the following items within the SSS were significantly associated with school violence risk to others: violent thoughts, impulsivity, compliance with treatment, insight, support, school problems, substance use, psychopathy, negative attitudes, active symptoms of mental illness, stressors, violent behavior, and access to weapons (p values <0.05). Additionally, the analysis on the annotations (Table 6) showed that there were significant language differences between the high risk and low risk groups for the following five (out of 11) annotation categories: negative feelings/thoughts/acts of subject, negative feelings/thoughts/acts of others, illegal acts or contact with the judicial system by subject, violent acts or thoughts of subject, and violent media or video games (p values <0.05).

We piloted machine learning, which leverages advanced computerized algorithms, to analyze student interviews and predict students' risk of school violence. Given the content of transcribed interviews, a multivariate logistic regression algorithm with L2 normalization was used for predicting risk of school violence for individual students. Nested ten-fold cross-validation was utilized to train and evaluate the logistic regression algorithm, where the model parameters were optimized with grid search parameterization. For model evaluation, we used the area under the receiver operating characteristic curve (AUC) as the primary measure. The algorithm achieved an AUC of 91.02% when using the interview content to predict risk of school violence, and the AUC increased to 91.45% when demographic and socioeconomic data were added.
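The structure of nested cross-validation with a grid search can be sketched as follows. The point of the nesting is that the hyperparameter is chosen using only the training portion of each outer fold, so the outer test fold never influences model selection. This is a generic sketch, not the study's implementation: the helper names, the per-fold accuracy score, and the toy threshold "model" are assumptions; a real pipeline would plug in a logistic regression fit and an AUC score.

```python
import random
from typing import Any, Callable, List, Sequence


def k_folds(n: int, k: int, seed: int) -> List[List[int]]:
    """Shuffle 0..n-1 and deal the indices into k roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def mean_cv_score(X, y, rows, param, fit, score, k, seed) -> float:
    """Average validation score of `param` over k folds drawn from `rows`."""
    folds = k_folds(len(rows), k, seed)
    total = 0.0
    for i, held_out in enumerate(folds):
        val = [rows[p] for p in held_out]
        trn = [rows[p] for j, f in enumerate(folds) if j != i for p in f]
        model = fit([X[r] for r in trn], [y[r] for r in trn], param)
        total += score(model, [X[r] for r in val], [y[r] for r in val])
    return total / k


def nested_cv(X: Sequence[Any], y: Sequence[int], grid: Sequence[Any],
              fit: Callable, score: Callable,
              outer_k: int = 10, inner_k: int = 10, seed: int = 0) -> List[float]:
    """Return one test score per outer fold."""
    outer = k_folds(len(y), outer_k, seed)
    results = []
    for i, test in enumerate(outer):
        train = [r for j, f in enumerate(outer) if j != i for r in f]
        # Inner loop: grid search using the outer-training rows only.
        best = max(grid, key=lambda p: mean_cv_score(X, y, train, p, fit, score, inner_k, seed + 1))
        model = fit([X[r] for r in train], [y[r] for r in train], best)
        results.append(score(model, [X[r] for r in test], [y[r] for r in test]))
    return results


# Toy stand-ins: the "model" is just a decision threshold taken from the grid.
def toy_fit(X, y, threshold):
    return threshold


def toy_score(threshold, X, y):
    return sum((x > threshold) == bool(t) for x, t in zip(X, y)) / len(y)


X_toy = [i % 10 for i in range(40)]
y_toy = [int(x > 5) for x in X_toy]
outer_scores = nested_cv(X_toy, y_toy, [2, 5, 8], toy_fit, toy_score, outer_k=5, inner_k=4)
```

In the toy data the label is exactly `x > 5`, so the inner search selects the threshold 5 in every outer fold.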

In addition to multivariate logistic regression with L2 normalization, embodiments may use other machine learning algorithms including naïve Bayes, logistic regression with different normalizations such as L1 normalization, gradient boosted variants (e.g., gradient boosted machine, gradient boosted logistic regression), support vector machine, classification and regression trees, random forests, artificial neural networks, and deep learning variants (e.g., convolutional neural networks and recurrent neural networks).

In addition to AUC, embodiments may use other evaluation metrics including positive predictive value, sensitivity, negative predictive value, specificity, and F-measure.
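For reference, the metrics listed above follow directly from the four confusion-matrix counts. The sketch below uses their standard definitions, with the F-measure taken as the balanced F1 score (an assumption, since the text does not specify a weighting). The function name and the example counts are hypothetical.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard binary-classification metrics from confusion-matrix counts."""
    def ratio(num: float, den: float) -> float:
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),   # true positive rate (recall)
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "ppv": ratio(tp, tp + fp),           # positive predictive value (precision)
        "npv": ratio(tn, tn + fn),           # negative predictive value
        "f_measure": ratio(2 * tp, 2 * tp + fp + fn),  # F1: harmonic mean of PPV and sensitivity
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
```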

Discussion. The high AUC achieved by machine learning suggests its potential to assist school violence risk assessment which could help minimize clinical subjectivity and maximize predictive validity in clinical practice [28-31]. The other problem with current methods for assessing violence risk in schools is that there is not a widely accepted risk assessment [32]. There are several violence risk assessments for children and adolescents, but they have been developed for use in hospitals [16, 20] and within the juvenile justice system [32]. With these considerations in mind, our research team developed novel school safety scales, the BRACHA and SSS, in order to assist schools in being able to assess risk within a reasonable amount of time (30 minutes on average) including gathering collateral information from the school and legal guardian. We used these scales in order to determine the gold standard risk level for each subject and to begin.

Although independent of race, school violence risk level was dependent on home income level [33]. Risk of violence towards others was inversely related to socioeconomic status in our study. Previous research has correlated poverty, aggression, and disruptive behaviors [34]. Race was not correlated with school violence risk level in this study, which is similar to our past findings for hospital aggression [16]. Surprisingly, being male was not associated with the level of school violence risk to others, in contrast to our previous findings [16].

The total scores of the BRACHA and SSS as well as 13 of the 14 items of the SSS were significantly correlated with the overall judgment on school violence risk to others but not with risk to self. Importantly, these results provide evidence that the SSS and BRACHA can be clinically used to specifically assess risk of violence to others at schools. These findings were expected since we methodically gathered aggression histories with the BRACHA and SSS [16, 21].

Five of the eleven annotation categories were significantly associated with school violence risk levels. Of these five significant categories, the most interesting finding was that negative feelings, thoughts, or acts of others were significantly associated with violence risk to others, similar to our first study [21]. This category captures the participant's perceptions of others' intentions or actions. For this category, it is likely that our annotation process was able to detect subtle negative misinterpretations, cognitive distortions, being bullied, or actual emotional or physical abuse by adults towards the participants.

In a large comparative study of violence risk assessment tools including 68 studies and participants, the AUCs of the most widely used risk assessments ranged from 0.54 to with median AUCs ranging from 0.66 to 0.78 [35]. More specifically, the gold standard violence risk assessment for adults (HCR-20) had a median AUC of 0.70 (with 8 studies) while the gold standard risk assessment for adolescents, the SAVRY, had a median AUC of 0.71 (with 8 studies) [35]. For our study, the machine learning AUC was 0.91, which is superior to the AUCs of adolescent and adult violence risk assessment [35]. This striking finding provided evidence that the machine learning algorithm (based only on the transcript of the participant's interview) was almost as accurate in assessing risk levels as a full assessment by our research team including gathering collateral from parents and the school, review of records when available, and scoring the SSS and BRACHA.

Although our sample size was large enough to develop the algorithm for machine learning, we estimate that we will require 336 subjects to fully develop the automated system for use in schools. Another limitation was that we did not have prospective aggression data from schools for our subjects. Given our funding and resource limitations, collecting prospective aggression data was not feasible. Since past violence is the strongest risk factor for future violence [36, 37], we ensured that our risk assessments meticulously collected violence and behavioral histories from the students, guardians, and schools. Another limitation was that we aimed to prevent mild, moderate, and severe school violence rather than developing a research method to solely prevent school shootings. Since the base rate of school shootings is low, we did not specifically aim to develop automatic text-based data analysis to prevent these rare events [38-40]. Our risk assessments were focused on preventing any type of physical aggression towards others at schools. Nevertheless, we evaluated risk factors with the BRACHA and SSS that could be highly relevant to potential school shootings.

Conclusions. Our pilot study provided evidence for the potential use of machine learning to augment structured professional judgment when assessing for school violence risk. In this study, we compared the risk levels determined by the research team based on outside information and record review (when available) versus risk levels determined by the machine learning analysis of the transcribed student interviews. The machine learning algorithm was accurate (AUC=91%) in assessing school violence risk when compared to the structured professional judgment approach. In the next two years, we expect that the machine learning algorithm will become more accurate with even higher AUCs. Ultimately, our goal is to spread the use of the machine learning technology to schools in the future to augment structured professional judgment to more effectively prevent school violence.

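TABLE 5 Demographics

| Characteristic | Low Risk to Self | Moderate-High Risk to Self | Low Risk to Others | Moderate-High Risk to Others |
|---|---|---|---|---|
| Sex: Female | 19 | 34 | 28 | 25 |
| Sex: Male | 37 | 13 | 20 | 30 |
| Race: Caucasian | 40 | 33 | 33 | 40 |
| Race: African American | 7 | 8 | 6 | 9 |
| Race: Biracial | 8 | 4 | 7 | 5 |
| Race: Asian | 1 | 2 | 2 | 1 |
| Hispanic: No | 54 | 42 | 42 | 54 |
| Hispanic: Yes | 2 | 5 | 6 | 1 |
| Public Assistance: No | 47 | 32 | 41 | 38 |
| Public Assistance: Yes | 9 | 15 | 7 | 17 |
| Annual Household Income: Less than $20,000 | 11 | 14 | 8 | 17 |
| Annual Household Income: $20,001-$40,000 | 16 | 22 | 15 | 23 |
| Annual Household Income: $40,001-$60,000 | 5 | 5 | 7 | 3 |
| Annual Household Income: $60,001-$90,000 | 12 | 0 | 4 | 8 |
| Annual Household Income: More than $90,000 | 12 | 6 | 14 | 4 |
| Education: Advanced Graduate/Professional Degree | 7 | 4 | 8 | 3 |
| Education: College Graduate | 13 | 13 | 15 | 11 |
| Education: Some College | 22 | 13 | 10 | 25 |
| Education: High School Grad/GRE | 8 | 15 | 11 | 12 |
| Education: Some High School | 6 | 2 | 4 | 4 |

| χ²-Test | Self-Risk (p-value) | Risk to Others (p-value) |
|---|---|---|
| Sex | 0.0001* | 0.1920 |
| Race | 0.6525 | 0.6895 |
| Hispanic | 0.1558 | 0.0317* |
| Public Assistance | 0.0582 | 0.0506 |
| Home Income | 0.0055 | 0.0113* |
| Home Education | 0.1631 | 0.0630 |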
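As a worked example of the χ² test on a demographic variable, the sketch below applies it to the Sex by Risk-to-Self counts from Table 5 (Female: 19 low, 34 moderate-high; Male: 37 low, 13 moderate-high). It assumes the uncorrected Pearson χ² statistic for a 2×2 table; the source does not state whether a continuity correction was applied. The function name `chi_square_2x2` is hypothetical.

```python
import math
from typing import Sequence, Tuple


def chi_square_2x2(table: Sequence[Sequence[int]]) -> Tuple[float, float]:
    """Uncorrected Pearson chi-square statistic and p-value for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: Female, Male. Columns: Low Risk to Self, Moderate-High Risk to Self.
chi2, p_value = chi_square_2x2([[19, 34], [37, 13]])
```

Under these assumptions the statistic comes out near 15.1 with a p-value on the order of 1e-4, in line with the value listed for Sex under Self-Risk.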

TABLE 6 Annotation Categories and Risk to Others

| Category | Pearson Correlation | Unpaired T-test (p-value) |
|---|---|---|
| Psychiatric diagnosis or symptoms | −0.035 | 0.722 |
| Negative feelings, thoughts, or acts of subject | 0.480 | 2.95E−07 |
| Negative feelings, thoughts or acts of others | 0.401 | 2.70E−05 |
| Illegal acts or contact with judicial system by subject | 0.314 | 0.00123 |
| Self-harm thoughts or acts of subject | −0.0856 | 0.390 |
| Verbal or physical response due to emotions of subject | −0.113 | 0.256 |
| Family discord or tragedies | 0.192 | 0.0521 |
| Positive feelings, thoughts, or acts of subject | 0.0415 | 0.677 |
| Violent media or video games | 0.443 | 2.83E−06 |
| Violent acts or thoughts of subject | 0.676 | 4.48E−15 |
| Protective factors (e.g., family support, counselor) | −0.0551 | 0.580 |

It will be appreciated that the disclosed embodiments are only exemplary in nature and it will be within the scope of those of ordinary skill to modify the disclosed embodiments without departing from the scope of the invention(s) as claimed. Further, while objects and advantages of the disclosure are provided herein, it will be apparent that it is not necessary to meet such objects and advantages to practice the invention(s) as claimed, since the disclosure may provide additional objects and advantages that have not been expressly disclosed herein.

TABLE OF REFERENCES

References [1]-[40] are listed below. The disclosure of each is incorporated by reference.

-   1. Musu-Gillette, Zhang, A., Wang, K., Zhang, J., and    Oudekerk, B. A. (2017). Indicators of School Crime and Safety: 2016.    National Center for Education Statistics, U.S. Department of    Education, and Bureau of Justice Statistics, Office of Justice    Programs, U.S. Department of Justice. Washington, DC.-   2. Centers for Disease Control and Prevention (CDC), 1992-2014    School-Associated Violent Death Surveillance System (SAVD-SS),    retrieved July 2016 from    http://www.cdc.gov/injury/wisqars/index.html; and Federal Bureau of    Investigation and Bureau of Justice Statistics, Supplementary    Homicide Reports (SHR), preliminary data (August 2016).-   3. National Association of School Psychologists. (2010). Crisis and    safety resources. Retrieved Apr. 3, 2014 from    http://www.nasponline.org/educators/index.aspx #crisis.-   4. McCoy D C, Roy A L, Sirkman G M. Neighborhood crime and school    climate as predictors of elementary school academic quality: a    cross-lagged panel analysis. Am J Community Psychol. 2013 September;    52(1-2):128-40. doi: 10.1007/s10464-013-9583-5.

Burdick-Will J. School Violent Crime and Academic Achievement inChicago. Social Educ. 2013 October; 86(4). doi:10.1177/0038040713494225.

-   6. Strom I F, Thoresen S, Wentzel-Larsen T, Dyb G. Violence, bullying and academic achievement: a study of 15-year-old adolescents and their school environment. Child Abuse Negl. 2013 April; 37(4):243-51. doi: 10.1016/j.chiabu.2012.10.010. Epub 2013 Jan 6.
-   7. Gottfredson G D, Cook P J, NA C: Schools and Prevention. In: Welsh B C, Farrington D P (Eds): Crime and Prevention. Oxford, United Kingdom: Oxford University Press, pp. 269-287, 2000.
-   8. Tanner-Smith E E, Wilson S J, Lipsey M W: Risk Factors and Crime. In: Maguire M, Morgan R, Reiner R (Eds) The Oxford Handbook of Criminology. 5th edn, Oxford, Oxford University Press, pp. 89-111, 2012.
-   9. Mytton J, DiGuiseppi C, Gough D, Taylor R, Logan S. School-based secondary prevention programmes for preventing violence. Cochrane Database Syst Rev. 2006 Jul. 19; (3): CD004606.
-   10. Park-Higgerson H K, Perumean-Chaney S E, Bartolucci A A, Grimley D M, Singh K P. The evaluation of school-based violence prevention programs: a meta-analysis. J Sch Health. 2008 Sep; 78(9):465-79; quiz 518-20. doi: 10.1111/j.1746-1561.2008.00332.x.
-   11. Borum R, Cornell D G, Modzeleski W, Jimerson S R: What can be done about school shootings? A Review of the Evidence. Educational Researcher 39(1): 27-37, 2010.
-   12. Nekvasil E K, Cornell D G: Student reports of peer threats of violence: Prevalence and outcomes. Journal of School Violence 11(4): 357-375, 2012.
-   13. Bernes K B, Bardick A D: Conducting adolescent violence risk assessments: A framework for school counselors. Professional School Counseling 10(4): 419-427, 2007.
-   14. McGowan M R, Horn R A, Mellott R N: The predictive validity of the structured assessment of violence risk in youth in secondary educational settings. Psychological Assessment 23(2): 478-486, 2011.
-   15. Monahan J, Steadman H: Violence Risk Assessment: A Quarter Century of Research. In: Frost L, Bonnie R (Eds.): The Evolution of Mental Health Law. Washington: American Psychological Association, pp. 195-211, 2001. doi:10.1037/10414-010.
-   16. Barzman D, Brackenbury L, Sonnier L, Schnell B, Cassedy A, Salisbury S, Sorter M, Mossman D: Brief rating of aggression by children and adolescents (BRACHA): Development of a Tool to Assess Risk of Inpatients' Aggressive Behavior. Journal of the American Academy of Psychiatry and the Law 39(2): 170-179, 2011.
-   17. Xia F, Yetisgen-Yildiz: Clinical corpus annotation: challenges and strategies. Proc. of Third Workshop on Building and Evaluating Resources for Biomedical Text Mining of the International Conference on Language Resources and Evaluation, 2012.
-   18. Kors J A, Clematide S, Akhondi S A, van Mulligen E M, Rebholz-Schuhmann D. A multilingual gold-standard corpus for biomedical concept recognition: the Mantra GSC. J Am Med Inform Assoc. 2015 September; 22(5):948-56. doi: 10.1093/jamia/ocv037.
-   19. Wilbur W J, Rzhetsky A, Shatkay H. New directions in biomedical text annotation: definitions, guidelines and corpus construction. BMC Bioinformatics. 2006 Jul. 25; 7:356.
-   20. Barzman D, Mossman D, Sonnier L, Sorter M: Brief rating of aggression by children and adolescents (BRACHA): A reliability study. Journal of the American Academy of Psychiatry and the Law 40:374-382, 2012.
-   21. Barzman, D. H., Ni, Y., Griffey, M., Patel, B., Warren, A., Latessa, E., & Sorter, M. (2017). A Pilot Study on Developing a Standardized and Sensitive School Violence Risk Assessment with Manual Annotation. Psychiatric Quarterly, 88(3), 447-457.
-   22. Douglas K S, Blanchard A J E, Guy L S, Reeves K A, Weir J (2010). HCR-20 Violence Risk Assessment Scheme: Overview and Annotated Bibliography. Retrieved from http://kdouglas.files.wordpress.com/2007/10/hcr-20-annotated-biblio-sept-2010.pdf.
-   23. Delgado S V, Barzman D, Gehle M, Caring M, Sorter M D, Kowatch R, Finding R: Characteristics of Discharges Against Medical Advice from Acute Inpatient Psychiatric Units for Children and Adolescents. Poster presented at the annual meeting of the American Academy of Child and Adolescent Psychiatry, Boston, 2007.
-   24. Hilterman E L, Nicholls T L, van Nieuwenhuizen C: Predictive performance of Risk Assessments in Juvenile Offenders: Comparing the SAVRY, PCL:YV, and YLS/CMI With Unstructured Clinical Assessments. Assessment, 2014.
-   25. Federal Bureau of Investigation. (1999). The School Shooter: A Threat Assessment Perspective. (Federal Bureau of Investigation, ED446352). Quantico, VA. Retrieved from http://www.fbi.gov/library/school/schoo12.pdf.
-   26. Lingren T, Deleger L, Molnar K, Zhai H, Meinzen-Derr J, Kaiser M, Stoutenborough L, Li Q, Solti I: Evaluating the impact of pre-annotation on annotation speed and potential bias: Natural language processing gold standard development for clinical named entity recognition in clinical trial announcements. Journal of the American Medical Informatics Association, 2013. doi:10.1136/amiajnl-2013-001837.
-   27. Deleger, L., K. Molnar, G. Savova, F. Xia, T. Lingren, Q. Li, K. Marsolo, et al. 2012. “Large-scale evaluation of automated clinical note de-identification and its impact on information extraction.” Journal of the American Medical Informatics Association: JAMIA 20 (1): 84-94.
-   28. Ganzert, S., Guttmann, J., Kersting, K., Kuhlen, R., Putensen, C., Sydow, M., & Kramer, S. (2002). Analysis of respiratory pressure-volume curves in intensive care medicine using inductive machine learning. Artificial intelligence in medicine, 26(1), 69-86.
-   29. Zacharaki, E. I., Wang, S., Chawla, S., Soo Yoo, D., Wolf, R., Melhem, E. R. & Davatzikos, C. (2009). Classification of brain tumor type and grade using MRI texture and shape in a machine learning scheme. Magn Reson Med, 62(6):1609-18.
-   30. Zrimec, T., & Kononenko, I. (2004). Feasibility analysis of machine learning medical diagnosis from aura images. In Proc. Int. Conf. KIRLIONICS-98 (Abstracts) (pp. 10-11).
-   31. Sara, NB, Halland R, Igel C, Alstrup S. High-School Dropout Prediction Using Machine Learning: A Danish Large-scale Study. European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning. Bruges (Belgium), 22-24 Apr. 2015.
-   32. Welsh J L, Schmidt F, McKinnon L, Chattha H K, Meyers J R. A Comparative Study of Adolescent Risk Assessment Instruments: Predictive and Incremental Validity. Assessment. 2008 Mar; 15(1):104-15.
-   33. Molnar, B. E., Cerda, M., Roberts, A. L., & Buka, S. L. (2008). Effects of neighborhood resources on aggressive and delinquent behaviors among urban youths. American Journal of Public Health, 98, 1086-1093. doi:10.2105/AJPH.2006.098913
-   34. Reed M. O., Jakubovski E., Johnson J. A., & Bloch M. H. Predictor of long-term school-based behavioral outcomes in the multimodal treatment study of children with attention-deficit/hyperactivity disorder. J Child Adolesc Psychopharmacol, 27(4): 296-309.
-   35. Singh, J. P., Grann M., & Fazel S. A comparative study of violence risk assessment tools: a systematic review and metaregression analysis of 68 studies involving 25,980 participants. Clin Psychol Rev, 31: 499-513, 2011.
-   36. Mossman, D. Assessing prediction of violence: being accurate about accuracy. J Consult Clin Psychol, 62 (4):783-792, 1994.
-   37. Janofsky, J. S., Spears, S., & Neubauer, D. N. (1988). Psychiatrists' accuracy in predicting violent behavior on an inpatient unit. Hospital and Community Psychiatry, 39, 1090-1094.
-   38. Neuman, Y., Assaf, D., Cohen Y., & Knoll, J. Profiling school shooters: automatic text-based analysis. Frontiers in Psychiatry 6: 1-5, 2015.
-   39. Shultz, J. M., Cohen, A. M., Muschert, G. W., and Flores de Apodaca, Roberto. Fatal school shootings and the epidemiological context of firearm mortality in the United States. Disaster Health. 2013 Apr-Dec; 1(2): 84-101.
-   40. Flannery D J, Modzeleski W, Kretschmar J M. Violence and school shootings. Curr Psychiatry Rep. 2013 January; 15(1):331.
-   41. Varma S, et al. Bias in error estimation when using cross-validation for model selection. Bioinformatics. 2006; 7:91.

1. A method for predicting risk of violence, comprising the steps of: receiving responses to questions from an individual in a digital form; extracting by a computerized annotator words or phrases from the digital form of the questions and responses; assigning by the annotator extracted words and/or phrases to at least one of a plurality of pre-defined categories, the pre-defined categories including: expression of violent acts or thoughts of the individual, expression of negative feelings, thoughts or acts of others, expression of negative feelings, thoughts or acts of the individual, expression of family discord or tragedies, and expression of protective factors; and automatically scoring words or phrases that could be classified into the pre-defined categories by a trained machine-learning engine to produce a score reflecting relative risk of violence by the individual.
2. The method of claim 1, wherein the pre-defined categories also include one or more of the following: expression of illegal acts or contact with the judicial system by the individual, expression of violent media or video games, expression of self-harm thoughts or acts of the individual, expression of family discord or tragedies, expression of psychiatric diagnosis or symptoms, and expression of positive feelings, thoughts or acts of the individual.
3. The method of claim 2, wherein the pre-defined categories also include each of the following: expression of illegal acts or contact with the judicial system by the individual, expression of violent media or video games, expression of self-harm thoughts or acts of the individual, expression of family discord or tragedies, and expression of psychiatric diagnosis or symptoms, and expression of positive feelings, thoughts or acts of the individual.
4. The method of claim 2, wherein the pre-defined categories also include: expression of verbal or physical response due to emotions of the individual.
5. The method of claim 1, wherein the questions to the individual were given from a pre-set questionnaire.
6. The method of claim 5, wherein the questionnaire asks open-ended questions.
7. The method of claim 6, wherein the questionnaire is based upon the Historical-Clinical Risk Management-20 (HCR-20) questionnaire.
8. The method of claim 6, wherein the questionnaire is based upon the Brief Rating of Aggression by Children and Adolescents (BRACHA) questionnaire.
9. The method of claim 6, wherein the questionnaire is based on a combination of the Historical-Clinical Risk Management-20 (HCR-20) questionnaire and the Brief Rating of Aggression by Children and Adolescents (BRACHA) questionnaire.
10. The method of claim 1, wherein the trained machine-learning engine further generates warning markers from the identified words or phrases.
11. The method of claim 10, wherein the warning markers are one or more of: identification of specific assigned words or phrases, or generation of risk factors from the assigned words or phrases.
12. The method of claim 1, wherein the trained machine-learning engine further considers demographic, socioeconomic status, social determinant, or environmental factor data of the individual in the scoring step.
13. The method of claim 1, wherein the assigning step utilizes a double annotations schema.
14. The method of claim 1, wherein the scoring step utilizes a Pearson Correlation coefficient.
15. The method of claim 1, wherein the annotator utilizes natural language processing algorithms.
16. The method of claim 1, wherein the individual is a juvenile and the score reflects relative risk of school violence by the juvenile.
17-27. (canceled)
28. The method of claim 16, wherein the individual is a juvenile and the violence score reflects relative risk of school violence by the juvenile.
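
By way of non-limiting illustration only, the extracting, assigning, and scoring steps recited in claim 1 might be sketched in Python as shown below. This is a minimal sketch under stated assumptions, not the disclosed implementation. The cue phrases in `LEXICON`, the values in `WEIGHTS` and `BIAS`, and the function names `annotate` and `risk_score` are all hypothetical. A keyword lookup stands in for the computerized annotator, and a fixed logistic model stands in for the trained machine-learning engine, whose parameters would in practice be learned from annotated interviews.

```python
import math
import re
from collections import Counter

# Hypothetical cue phrases per category. The category names follow claim 1;
# the phrases themselves are invented for illustration only.
LEXICON = {
    "violent_acts_or_thoughts_of_individual": ["hit", "fight", "hurt someone"],
    "negative_feelings_or_acts_of_others": ["bullied me", "picked on"],
    "negative_feelings_or_acts_of_individual": ["angry", "hate"],
    "family_discord_or_tragedies": ["divorce", "passed away"],
    "protective_factors": ["good grades", "supportive", "friends"],
}

# Hypothetical weights standing in for the parameters of a trained model.
WEIGHTS = {
    "violent_acts_or_thoughts_of_individual": 1.2,
    "negative_feelings_or_acts_of_others": 0.6,
    "negative_feelings_or_acts_of_individual": 0.8,
    "family_discord_or_tragedies": 0.5,
    "protective_factors": -0.9,
}
BIAS = -1.0  # hypothetical intercept


def annotate(text):
    """Extract cue phrases from text and assign each to a category.

    Returns a list of (phrase, category) pairs.
    """
    lowered = text.lower()
    annotations = []
    for category, cues in LEXICON.items():
        for cue in cues:
            # Word boundaries keep "hit" from matching inside "white".
            pattern = r"\b" + re.escape(cue) + r"\b"
            for match in re.finditer(pattern, lowered):
                annotations.append((match.group(0), category))
    return annotations


def risk_score(annotations):
    """Map category counts to a relative risk score between 0 and 1."""
    counts = Counter(category for _, category in annotations)
    z = BIAS + sum(WEIGHTS[category] * n for category, n in counts.items())
    return 1.0 / (1.0 + math.exp(-z))  # logistic squashing


if __name__ == "__main__":
    response = "I got angry and got in a fight, but I have supportive friends."
    found = annotate(response)
    for phrase, category in found:
        print(f"{phrase!r} -> {category}")
    print(f"relative risk score: {risk_score(found):.3f}")
```

In this sketch, phrases assigned to the protective-factors category lower the score, and phrases assigned to the other categories raise it. The list returned by `annotate` could also serve as a source of the warning markers recited in claims 10 and 11.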